Abstract. The subset sum problem over finite fields is a well-known NP-complete
problem. It arises naturally from decoding generalized Reed-Solomon codes. In this
paper, we study the number of solutions of the subset sum problem from a
mathematical point of view. In several interesting cases, we obtain explicit or
asymptotic formulas for the solution number. As a consequence, we obtain some
results on the decoding problem of Reed-Solomon codes.



1. Introduction 

Let $\mathbb{F}_q$ be a finite field of characteristic $p$. Let $D \subseteq \mathbb{F}_q$ be a subset of
cardinality $|D| = n > 0$. Let $1 \le m \le k \le n$ be integers, and let $b_1, \dots, b_m$ be
given elements of $\mathbb{F}_q$. Let $V_{b,k}$ denote the affine variety in $\mathbb{A}^k$ defined by the
following system of equations

\[ \sum_{i=1}^{k} X_i = b_1, \]
\[ \sum_{1 \le i_1 < i_2 \le k} X_{i_1} X_{i_2} = b_2, \]
\[ \cdots \]
\[ \sum_{1 \le i_1 < i_2 < \cdots < i_m \le k} X_{i_1} \cdots X_{i_m} = b_m, \]
\[ X_i - X_j \ne 0 \quad (i \ne j). \]

A fundamental problem arising from decoding Reed-Solomon codes is to determine,
for any given $b = (b_1, \dots, b_m) \in \mathbb{F}_q^m$, whether the variety $V_{b,k}$ has an
$\mathbb{F}_q$-rational point with all $x_i \in D$; see Section 5 for more details. This problem is
apparently difficult because several parameters of different natures are involved.
The high degree of the variety naturally introduces a substantial algebraic
difficulty, but this can at least be overcome in some cases when $D$ is the full
field $\mathbb{F}_q$ and $m$ is small, using the Weil bound. The requirement that the $x_i$'s are
distinct leads to a significant combinatorial difficulty. From a computational point
of view, a more substantial difficulty is caused by the flexibility of the subset $D$
of $\mathbb{F}_q$. In fact, even in the case $m = 1$, where the algebraic difficulty disappears,
the problem is known to be NP-complete. In this case, the problem reduces to the
well-known subset sum problem over $D \subseteq \mathbb{F}_q$, that is, to determine for a given
$b \in \mathbb{F}_q$ whether there is a non-empty subset $\{x_1, x_2, \dots, x_k\} \subseteq D$ such that

\[ x_1 + x_2 + \cdots + x_k = b. \qquad (1.1) \]
This subset sum problem is known to be NP-complete. Given an integer $1 \le k \le n$
and $b \in \mathbb{F}_q$, a more precise problem is to determine

\[ N(k, b, D) = \#\{ \{x_1, x_2, \dots, x_k\} \subseteq D \mid x_1 + x_2 + \cdots + x_k = b \}, \]

the number of $k$-element subsets of $D$ whose sum is $b$. The decision version of the
above subset sum problem is then to determine if $N(k, b, D) > 0$ for some $k$, that
is, if

\[ N(b, D) := \sum_{k=1}^{n} N(k, b, D) > 0. \]

In this paper, we study the approximation version of the above subset sum
problem for each $k$ from a mathematical point of view, that is, we try to approximate
the solution number $N(k, b, D)$. Intuitively, the problem is easier if $D$ is close to
being the full field $\mathbb{F}_q$, i.e., when $q - n$ is small. Indeed, we obtain an asymptotic
formula for $N(k, b, D)$ when $q - n$ is small. Heuristically, $N(k, b, D)$ should be
approximately $\frac{1}{q}\binom{n}{k}$. The question is about the error term. We have

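Theorem 1.1. Let $p < q$, that is, $\mathbb{F}_q$ is not a prime field. Let $D \subseteq \mathbb{F}_q$ be a
subset of cardinality $n$. For any $1 \le k \le n \le q - 2$ and any $b \in \mathbb{F}_q$, we have the
inequality

\[ \left| N(k,b,D) - \frac{1}{q}\binom{n}{k} \right| \le \frac{q-p}{q} \binom{k+q-n-2}{q-n-2} \binom{q/p-1}{\lfloor k/p \rfloor}. \]

Furthermore, let $D = \mathbb{F}_q \setminus \{a_1, \dots, a_{q-n}\}$ with $a_1 = 0$. If $b, a_2, \dots, a_{q-n}$ are
linearly independent over $\mathbb{F}_p$, then we have the improved estimate

\[ \left| N(k,b,D) - \frac{1}{q}\binom{n}{k} \right| \le \max_{0 \le j \le k} \frac{p}{q} \binom{k+q-n-2-j}{q-n-2} \binom{q/p-1}{\lfloor j/p \rfloor}. \]

When $q = p$, that is, $\mathbb{F}_q$ is a prime field, we have

\[ \left| N(k,b,D) - \frac{1}{q}\binom{n}{k} + \frac{(-1)^k}{q}\binom{k+q-n-1}{q-n-1} \right| \le \binom{k+q-n-2}{q-n-2}. \]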
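The prime-field case can be checked by brute force for very small $p$. The following is a minimal sketch, not part of the paper: the function names `count_subsets` and `prime_field_bound_holds` are illustrative, and the inequality tested is the prime-field estimate $|N - \frac{1}{p}\binom{n}{k} + \frac{(-1)^k}{p}\binom{k+c-1}{c-1}| \le \binom{k+c-2}{c-2}$ with $c = p - n \ge 2$, multiplied through by $p$ so that only integers are compared.

```python
from itertools import combinations
from math import comb


def count_subsets(k, b, D, p):
    """N(k, b, D): number of k-element subsets of D (inside Z/p) whose sum is b mod p."""
    return sum(1 for s in combinations(D, k) if sum(s) % p == b % p)


def prime_field_bound_holds(p):
    """Exhaustively test the prime-field estimate for every D with |D| = n <= p - 2."""
    for n in range(1, p - 1):
        c = p - n  # number of removed elements, c >= 2
        for D in combinations(range(p), n):
            for k in range(1, n + 1):
                for b in range(p):
                    N = count_subsets(k, b, D, p)
                    # p * (N - C(n,k)/p + (-1)^k C(k+c-1,c-1)/p), kept as an integer
                    lhs = abs(p * N - comb(n, k) + (-1) ** k * comb(k + c - 1, c - 1))
                    if lhs > p * comb(k + c - 2, c - 2):
                        return False
    return True
```

For example, `prime_field_bound_holds(7)` enumerates all 119 admissible subsets $D \subseteq \mathbb{F}_7$; the search is exponential in $p$, so this is only meant as a sanity check on tiny fields.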

Theorem 1.1 assumes that $n \le q - 2$. In the remaining case $n \ge q - 2$, that
is, $n \in \{q - 2, q - 1, q\}$, the situation is nicer and we obtain explicit formulas for
$N(k, b, D)$. Here we first state the results for $q - n \le 1$, where we can take
$D = \mathbb{F}_q$ or $\mathbb{F}_q^*$.

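Theorem 1.2. Define $v(b) = -1$ if $b \ne 0$, and $v(b) = q - 1$ if $b = 0$. Then

\[ N(k, b, \mathbb{F}_q^*) = \frac{1}{q}\binom{q-1}{k} + (-1)^{k + \lfloor k/p \rfloor}\, \frac{v(b)}{q} \binom{q/p-1}{\lfloor k/p \rfloor}. \]

If $p \nmid k$, then

\[ N(k, b, \mathbb{F}_q) = \frac{1}{q}\binom{q}{k}. \]

If $p \mid k$, then

\[ N(k, b, \mathbb{F}_q) = \frac{1}{q}\binom{q}{k} + (-1)^{k + k/p}\, \frac{v(b)}{q} \binom{q/p}{k/p}. \]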

When $q - n = 2$, note that we can always take $D = \mathbb{F}_q \setminus \{0, 1\}$.

Theorem 1.3. Let $q > 2$. Then we have

\[ N(k, b, \mathbb{F}_q \setminus \{0, 1\}) = \frac{1}{q}\binom{q-2}{k} + \frac{1}{q}(-1)^k R_k^2 - (-1)^k S(k, k-b), \]

where $R_k^2$ and $S(k,b)$ are defined as in (3.2) and (3.3).

This paper is organized as follows. We first prove Theorem 1.2 and Theorem 1.3
in Section 2 and Section 3 respectively. Then we prove Theorem 1.1 in Section 4.
Applications to coding theory are given in Section 5.

Notations. For $x \in \mathbb{R}$, let $(x)_0 = 1$ and $(x)_k = x(x-1)\cdots(x-k+1)$ for
$k \in \mathbb{Z}^+ = \{1, 2, 3, \dots\}$. For $k \in \mathbb{N} = \{0, 1, 2, \dots\}$ define the binomial coefficient
$\binom{x}{k} = \frac{(x)_k}{k!}$. For a real number $a$ we denote by $\lfloor a \rfloor$ the largest integer not
greater than $a$.

2. Proof of Theorem 1.2

When $|D|$ equals $q - 1$, it suffices to consider $N(k, b, \mathbb{F}_q^*)$ by a simple linear
substitution. Let $M(k, b, D)$ denote the number of ordered tuples $(x_1, x_2, \dots, x_k)$
of distinct elements of $D$ satisfying equation (1.1). Then

\[ M(k,b,D) = k!\, N(k,b,D) \]

is the number of solutions of the equation

\[ x_1 + \cdots + x_k = b, \quad x_i \in D, \quad x_i \ne x_j \ (i \ne j). \qquad (2.1) \]

It suffices to determine $M(k, b, D)$. We use a purely combinatorial method to find
recursive relations among the values of $M(k, b, \mathbb{F}_q)$ and $M(k, b, \mathbb{F}_q^*)$.

Lemma 2.1. For $b \ne 0$ and $D$ being $\mathbb{F}_q$ or $\mathbb{F}_q^*$, we have $M(k,b,D) = M(k,1,D)$.

Proof. There is a one-to-one map sending the solution $\{x_1, x_2, \dots, x_k\}$ of (2.1) to
the solution $\{x_1 b^{-1}, \dots, x_k b^{-1}\}$ of (2.1) with $b = 1$. □

Lemma 2.2.

\[ M(k, 1, \mathbb{F}_q) = M(k, 1, \mathbb{F}_q^*) + k M(k-1, 1, \mathbb{F}_q^*), \qquad (2.2) \]

\[ M(k, 0, \mathbb{F}_q) = M(k, 0, \mathbb{F}_q^*) + k M(k-1, 0, \mathbb{F}_q^*), \qquad (2.3) \]

\[ (q)_k = (q-1) M(k, 1, \mathbb{F}_q) + M(k, 0, \mathbb{F}_q), \qquad (2.4) \]

\[ (q-1)_k = (q-1) M(k, 1, \mathbb{F}_q^*) + M(k, 0, \mathbb{F}_q^*). \qquad (2.5) \]

Proof. Fix an element $c \in \mathbb{F}_q$. The solutions of (2.1) in $\mathbb{F}_q$ can be divided into
two classes depending on whether $c$ occurs. By a linear substitution, the number
of solutions of (2.1) in $\mathbb{F}_q$ not including $c$ equals $M(k, b - ck, \mathbb{F}_q^*)$, and the number
of solutions of (2.1) in $\mathbb{F}_q$ including $c$ equals $k M(k-1, b - ck, \mathbb{F}_q^*)$. Hence we have

\[ M(k, b, \mathbb{F}_q) = M(k, b - ck, \mathbb{F}_q^*) + k M(k-1, b - ck, \mathbb{F}_q^*). \qquad (2.6) \]

Then (2.2) follows by choosing $b = 1$, $c = 0$. Similarly, (2.3) follows by choosing
$b = 0$, $c = 0$. Note that $(q)_k$ is the number of $k$-permutations of $\mathbb{F}_q$, and $(q-1)_k$ is
the number of $k$-permutations of $\mathbb{F}_q^*$. Thus, both (2.4) and (2.5) follow from Lemma 2.1. □



The next step is to find more relations between $M(k, b, \mathbb{F}_q)$ and $M(k, b, \mathbb{F}_q^*)$.

Lemma 2.3. If $p \nmid k$, we have $M(k, b, \mathbb{F}_q) = M(k, 0, \mathbb{F}_q)$ for all $b \in \mathbb{F}_q$ and hence

\[ M(k, b, \mathbb{F}_q) = \frac{1}{q} (q)_k. \]

If $p \mid k$, we have $M(k,b,\mathbb{F}_q) = q M(k-1, b, \mathbb{F}_q^*)$ for all $b \in \mathbb{F}_q$.

Proof. Case 1: Since $p \nmid k$, we can take $c = k^{-1} b$ in (2.6) and get the relation

\[ M(k, b, \mathbb{F}_q) = M(k, 0, \mathbb{F}_q^*) + k M(k-1, 0, \mathbb{F}_q^*). \]

The right side is just $M(k,0,\mathbb{F}_q)$ by (2.3).

Case 2: In this case, $p \mid k$. Then $M(k,b,\mathbb{F}_q)$ equals the number of ordered
solutions of the following system of equations:

\[ x_1 + x_2 + \cdots + x_k = b, \]
\[ x_1 - x_2 = y_2, \]
\[ \cdots \]
\[ x_1 - x_k = y_k, \]
\[ y_i \in \mathbb{F}_q^*, \quad y_i \ne y_j, \quad 2 \le i < j \le k. \]

Regarding $x_1, x_2, \dots, x_k$ as variables, it is easy to check that the $p$-rank (the rank
of a matrix over the prime field $\mathbb{F}_p$) of the coefficient matrix of the above system of
equations equals $k - 1$. The system has solutions if and only if $\sum_{i=2}^{k} y_i = -b$ with
the $y_i \in \mathbb{F}_q^*$ distinct. Furthermore, since the $p$-rank of the above system is $k - 1$, when
$y_2, y_3, \dots, y_k$ and $x_1$ are given, $x_2, x_3, \dots, x_k$ are uniquely determined.
This means that the number of solutions of the above linear system of equations equals
$q$ times the number of ordered solutions of the following equation:

\[ y_2 + y_3 + \cdots + y_k = -b, \]
\[ y_i \in \mathbb{F}_q^*, \quad y_i \ne y_j, \quad 2 \le i < j \le k. \]

The number of solutions of the above equation is $M(k-1, -b, \mathbb{F}_q^*)$, which equals
$M(k-1, b, \mathbb{F}_q^*)$ by Lemma 2.1, and hence
$M(k,b,\mathbb{F}_q) = q M(k-1, b, \mathbb{F}_q^*)$. □

We have obtained several relations from Lemma 2.2 and Lemma 2.3. To determine
$M(k, b, \mathbb{F}_q)$, it is now sufficient to know $M(k, 0, \mathbb{F}_q^*)$. Define for $k > 0$,

\[ d_k = M(k, 1, \mathbb{F}_q^*) - M(k, 0, \mathbb{F}_q^*). \]

Then by (2.5) we have

\[ q M(k, 0, \mathbb{F}_q^*) = (q-1)_k - (q-1) d_k. \qquad (2.7) \]

Heuristically, $M(k, 0, \mathbb{F}_q^*)$ should be approximately $\frac{1}{q}(q-1)_k$. To obtain the explicit
value of $M(k, 0, \mathbb{F}_q^*)$, we only need to know $d_k$. For convenience we set $d_0 = -1$.

Lemma 2.4. If $d_k$ is defined as above, then

\[ d_k = \begin{cases} -1, & k = 0; \\ 1, & k = 1; \\ -k\, d_{k-1}, & p \nmid k,\ 2 \le k \le q-1; \\ (q-k)\, d_{k-1}, & p \mid k,\ 2 \le k \le q-1. \end{cases} \]

Proof. One checks that $d_1 = M(1, 1, \mathbb{F}_q^*) - M(1, 0, \mathbb{F}_q^*) = 1 - 0 = 1$. When $p \nmid k$,
by Lemma 2.3 we have $M(k, 1, \mathbb{F}_q) = M(k, 0, \mathbb{F}_q)$. This together with Lemma 2.2
implies

\[ M(k, 1, \mathbb{F}_q^*) - M(k, 0, \mathbb{F}_q^*) = k \left( M(k-1, 0, \mathbb{F}_q^*) - M(k-1, 1, \mathbb{F}_q^*) \right). \]

Namely, $d_k = -k\, d_{k-1}$. When $p \mid k$, using Lemma 2.3 we have

\[ M(k, 1, \mathbb{F}_q) - M(k, 0, \mathbb{F}_q) = q \left( M(k-1, 1, \mathbb{F}_q^*) - M(k-1, 0, \mathbb{F}_q^*) \right) = q\, d_{k-1}. \]

By Lemma 2.2 the left side is $d_k + k\, d_{k-1}$. Thus, $d_k = (q - k)\, d_{k-1}$. □
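Corollary 2.5.

\[ d_k = -(-1)^{k + \lfloor k/p \rfloor}\, k! \binom{q/p-1}{\lfloor k/p \rfloor}. \]

Proof. One checks that $d_0 = -1$ and $d_1 = 1$ are consistent with the above formula for
$k \le 1$. Let $k \ge 2$ and write $k = np + m$ with $0 \le m < p$. By Lemma 2.4,

\[ \frac{d_k}{k!} = (-1)^{n(p-1)+m+1} \prod_{i=1}^{n} \frac{q - ip}{ip} = -(-1)^{k+n} \binom{q/p-1}{n}. \]

It is easy to check that if $q = p$, then we have $d_k = (-1)^{k-1} k!$, which is consistent
with the definition $\binom{0}{0} = 1$. □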
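As a quick consistency check between the recursion of Lemma 2.4 and the closed form $d_k = -(-1)^{k+\lfloor k/p\rfloor} k! \binom{q/p-1}{\lfloor k/p\rfloor}$, the two can be compared numerically. This is an illustrative sketch; the function names are not from the paper.

```python
from math import comb, factorial


def d_recursive(k_max, p, q):
    """d_0, ..., d_{k_max} from the recursion: d_k = -k d_{k-1} if p does not divide k,
    and d_k = (q - k) d_{k-1} if p divides k (with d_0 = -1, d_1 = 1)."""
    d = [-1, 1]
    for k in range(2, k_max + 1):
        d.append((q - k) * d[-1] if k % p == 0 else -k * d[-1])
    return d[:k_max + 1]


def d_closed(k, p, q):
    """Closed form d_k = -(-1)^(k + floor(k/p)) * k! * C(q/p - 1, floor(k/p))."""
    return -((-1) ** (k + k // p)) * factorial(k) * comb(q // p - 1, k // p)
```

For instance, with $p = 2$, $q = 8$ both routes give $d_0, \dots, d_4 = -1, 1, 6, -18, -72$.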



Proof of Theorem 1.2. Let $M(k,b,D)$ be the number of solutions of (2.1).
Note that $M(k,b,D) = k!\, N(k,b,D)$ and $d_k = -(-1)^{k + \lfloor k/p \rfloor} k! \binom{q/p-1}{\lfloor k/p \rfloor}$. Thus it is
sufficient to prove

\[ M(k, b, \mathbb{F}_q^*) = \frac{(q-1)_k - v(b)\, d_k}{q}, \]

\[ M(k, b, \mathbb{F}_q) = \frac{(q)_k - v(b)\,(d_k + k\, d_{k-1})}{q}. \]

If $b = 0$, by (2.7) we obtain

\[ q M(k, 0, \mathbb{F}_q^*) = (q-1)_k - (q-1)\, d_k. \]

If $b \ne 0$, then

\[ q M(k, b, \mathbb{F}_q^*) = q M(k, 1, \mathbb{F}_q^*) = q\, d_k + q M(k, 0, \mathbb{F}_q^*) = (q-1)_k + d_k. \]

The formula for $M(k, b, \mathbb{F}_q^*)$ holds.

If $p \nmid k$, then $d_k + k\, d_{k-1} = 0$ and the formula for $M(k, b, \mathbb{F}_q)$ holds by Lemma 2.3.

If $p \mid k$, then $d_k + k\, d_{k-1} = q\, d_{k-1}$. By Lemma 2.3 and the above formula for
$M(k, b, \mathbb{F}_q^*)$, we deduce

\[ M(k, b, \mathbb{F}_q) = q M(k-1, b, \mathbb{F}_q^*) = (q-1)_{k-1} - v(b)\, d_{k-1}. \]

Since $(q)_k = q\,(q-1)_{k-1}$, the formula for $M(k, b, \mathbb{F}_q)$ holds. The proof is complete. □

Now we turn to deciding when the solution number $N(k, b, \mathbb{F}_q^*) > 0$. A sequence
$\{a_0, a_1, \dots, a_n\}$ is unimodal if there exists an index $k$ with $0 \le k \le n$ such that

\[ a_0 \le a_1 \le \cdots \le a_{k-1} \le a_k \ge a_{k+1} \ge \cdots \ge a_n. \]

The sequence $\{a_0, a_1, \dots, a_n\}$ is called symmetric if $a_i = a_{n-i}$ for $0 \le i \le n$.
Corollary 2.6. For any $b \in \mathbb{F}_q$, both the sequence $N(k, b, \mathbb{F}_q)$ $(1 \le k \le q)$ and the
sequence $N(k, b, \mathbb{F}_q^*)$ $(1 \le k \le q - 1)$ are unimodal and symmetric.

Proof. The symmetric part can be verified using Theorem 1.2. A simpler way is to
use the relation

\[ \sum_{a \in \mathbb{F}_q} a = \sum_{a \in \mathbb{F}_q^*} a = 0. \]

To prove the unimodal property for $N(k, b, \mathbb{F}_q^*)$, by the symmetry it is sufficient
to consider the case $k \le (q-1)/2$. Then, by Theorem 1.2 we deduce

\[ q \left( N(k, 0, \mathbb{F}_q^*) - N(k-1, 0, \mathbb{F}_q^*) \right) \ge \binom{q-1}{k} - \binom{q-1}{k-1} - (q-1) \left( \binom{q/p-1}{\lfloor k/p \rfloor} - \binom{q/p-1}{\lfloor (k-1)/p \rfloor} \right). \]

If $p \nmid k$, then $\lfloor k/p \rfloor = \lfloor (k-1)/p \rfloor$ and the right side is clearly positive. If $p \mid k$, then

\[ q \left( N(k, 0, \mathbb{F}_q^*) - N(k-1, 0, \mathbb{F}_q^*) \right) \ge \frac{q-2k}{k} \binom{q-1}{k-1} - (q-1)\, \frac{q/p - 2k/p}{k/p} \binom{q/p-1}{k/p-1} = \frac{q-2k}{k} \left( \binom{q-1}{k-1} - (q-1) \binom{q/p-1}{k/p-1} \right). \]

When $p = 2$ and $k = 2, 4$, or $q < 9$, it is easy to check that $\binom{q-1}{k-1} \ge (q-1) \binom{q/p-1}{k/p-1}$.
Otherwise, by Vandermonde's convolution

\[ \binom{q-1}{k-1} = \sum_{i} \binom{q/p-1}{i} \binom{q-q/p}{k-1-i}, \]

it suffices to prove

\[ \binom{q-q/p}{k-k/p} \ge q-1. \]

This inequality follows by noting that

\[ \binom{q-q/p}{k-k/p} \ge \binom{q-q/p}{2} \]

and $q \ge 9$. Thus $N(k, 0, \mathbb{F}_q^*)$ is unimodal. The proof for the unimodality of
$N(k, b, \mathbb{F}_q)$ is similar. This completes the proof. □

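Corollary 2.7. Let $|D| = q - 1 > 4$. If $p$ is an odd prime, then for $1 < k < q - 2$ the
equation (1.1) always has a solution. If $p = 2$, then for $2 < k < q - 3$ the equation
(1.1) always has a solution.

Proof. For any $a \in \mathbb{F}_q$ we have $N(k, b, \mathbb{F}_q \setminus \{a\}) = N(k, b - ka, \mathbb{F}_q^*)$. Thus it is
sufficient to consider $N(k, 1, \mathbb{F}_q^*)$ and $N(k, 0, \mathbb{F}_q^*)$ by Lemma 2.1. When $p$ is odd
and $k = 2$, we have $N(2, 0, \mathbb{F}_q^*) = \frac{1}{q}\left( \binom{q-1}{2} + (q-1) \right) = \frac{q-1}{2} > 0$, and
$N(2, 1, \mathbb{F}_q^*) = \frac{1}{q}\left( \binom{q-1}{2} - 1 \right) = \frac{q-3}{2} > 0$ from Theorem 1.2. Then, by the unimodality and
symmetry of $N(k, 1, \mathbb{F}_q^*)$ and $N(k, 0, \mathbb{F}_q^*)$, for $1 < k < q - 2$, $N(k, b, \mathbb{F}_q \setminus \{a\})$ must be positive.

Similarly, when $p = 2$ and $k = 3$ we have
$N(3, 0, \mathbb{F}_q^*) = \frac{1}{q}\left( \binom{q-1}{3} + (q-1)\left(\frac{q}{2} - 1\right) \right) = \frac{(q-1)(q-2)}{6} > 0$
and $N(3, 1, \mathbb{F}_q^*) = \frac{1}{q}\left( \binom{q-1}{3} - \left(\frac{q}{2} - 1\right) \right) = \frac{(q-2)(q-4)}{6} > 0$. By the
unimodality and symmetry we complete the proof. □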
Corollary 2.8. Let $D = \mathbb{F}_q$. If $p$ is an odd prime, then the equation (1.1) always
has a solution if and only if $0 < k < q$. If $p = 2$, then for $2 < k < q - 2$ the equation
(1.1) always has a solution.

Proof. It is straightforward from Corollary 2.7 and Theorem 1.2. □

3. Proof of Theorem 1.3

Before our proof of Theorem 1.3 we first give several lemmas, which give some
basic formulas for sums of sign-alternating binomial coefficients.

Lemma 3.1. Let $k, m$ be integers. Then we have

\[ \sum_{j=0}^{m} (-1)^j \binom{k}{j} = (-1)^m \binom{k-1}{m}. \]

Proof. It follows by comparing the coefficients of $x^m$ on both sides of
$(1 - x)^{-1} (1 - x)^{k} = (1 - x)^{k-1}$. □

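Lemma 3.2. Let $\langle k \rangle_p$ be the least non-negative residue of $k$ modulo $p$. For any
positive integers $a, k$, we have

\[ -\sum_{j=0}^{k} (-1)^{\lfloor j/p \rfloor} \binom{a}{\lfloor j/p \rfloor} = -p\, (-1)^{\lfloor k/p \rfloor} \binom{a-1}{\lfloor k/p \rfloor} + (p - 1 - \langle k \rangle_p)\, (-1)^{\lfloor k/p \rfloor} \binom{a}{\lfloor k/p \rfloor}, \]

and thus a bound on the absolute value of this sum, inequality (3.1).

Proof. Let $j = n_j p + m_j$ with $0 \le m_j < p$. Applying Lemma 3.1 we have

\[ -\sum_{j=0}^{k} (-1)^{\lfloor j/p \rfloor} \binom{a}{\lfloor j/p \rfloor} = -p \sum_{n_j=0}^{\lfloor k/p \rfloor} (-1)^{n_j} \binom{a}{n_j} + (p - 1 - \langle k \rangle_p)\, (-1)^{\lfloor k/p \rfloor} \binom{a}{\lfloor k/p \rfloor} \]
\[ = -p\, (-1)^{\lfloor k/p \rfloor} \binom{a-1}{\lfloor k/p \rfloor} + (p - 1 - \langle k \rangle_p)\, (-1)^{\lfloor k/p \rfloor} \binom{a}{\lfloor k/p \rfloor}. \]

The inequality (3.1) follows by noting the alternating signs before the two binomial
coefficients. □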
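Both alternating-sum identities are easy to confirm numerically. The sketch below is illustrative only: it checks $\sum_{j=0}^{m}(-1)^j\binom{k}{j} = (-1)^m\binom{k-1}{m}$ for $k \ge 1$, and the block-sum identity $-\sum_{j=0}^{k}(-1)^{\lfloor j/p\rfloor}\binom{a}{\lfloor j/p\rfloor} = -p(-1)^{\lfloor k/p\rfloor}\binom{a-1}{\lfloor k/p\rfloor} + (p-1-\langle k\rangle_p)(-1)^{\lfloor k/p\rfloor}\binom{a}{\lfloor k/p\rfloor}$ for $a \ge 1$. The function names are assumptions made for this example.

```python
from math import comb


def lemma_3_1_holds(k, m):
    """Check sum_{j=0}^{m} (-1)^j C(k, j) == (-1)^m C(k-1, m) for k >= 1, m >= 0."""
    return sum((-1) ** j * comb(k, j) for j in range(m + 1)) == (-1) ** m * comb(k - 1, m)


def lemma_3_2_sides(a, k, p):
    """Return (lhs, rhs) of the block-sum identity for a >= 1, k >= 0."""
    m, r = divmod(k, p)  # m = floor(k/p), r = <k>_p
    lhs = -sum((-1) ** (j // p) * comb(a, j // p) for j in range(k + 1))
    rhs = -p * (-1) ** m * comb(a - 1, m) + (p - 1 - r) * (-1) ** m * comb(a, m)
    return lhs, rhs
```

The second identity comes from grouping the indices $j$ into blocks of length $p$ and subtracting the $p - 1 - \langle k\rangle_p$ terms missing from the last block.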

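Lemma 3.3. Let $R_k^1 = (-1)^k \frac{d_k}{k!} = -(-1)^{\lfloor k/p \rfloor} \binom{q/p-1}{\lfloor k/p \rfloor}$. Let $\langle k \rangle_p$ denote the least
non-negative residue of $k$ modulo $p$. Define $R_k^2 = \sum_{j=0}^{k} R_j^1$. Then we have

\[ R_k^2 = -p\, (-1)^{\lfloor k/p \rfloor} \binom{q/p-2}{\lfloor k/p \rfloor} + (p - 1 - \langle k \rangle_p)\, (-1)^{\lfloor k/p \rfloor} \binom{q/p-1}{\lfloor k/p \rfloor}. \qquad (3.2) \]

Moreover, let $b \in \mathbb{F}_p$. Define $\delta_{b,k} = 1$ if $\langle b \rangle_p$ is greater than $\langle k \rangle_p$ and $\delta_{b,k} = 0$
otherwise. Then we have

\[ S(k, b) = \sum_{\substack{0 \le i \le k \\ i \equiv b \ (\mathrm{mod}\ p)}} R_i^1 = -(-1)^{\lfloor k/p \rfloor} \binom{q/p-2}{\lfloor k/p \rfloor} + \delta_{b,k}\, (-1)^{\lfloor k/p \rfloor} \binom{q/p-1}{\lfloor k/p \rfloor}. \qquad (3.3) \]

Proof. Note that (3.2) is direct from Lemma 3.2 by setting $a = q/p - 1$. Since it is
similar to that of Lemma 3.2, we omit the proof of (3.3). □

We extend the equation (3.3) by defining $S(k,b) = 0$ for $b \notin \mathbb{F}_p$ and any integer
$k$. Note that $S(k,b) \le \binom{q/p-2}{\lfloor k/p \rfloor}$. In the following theorem, we give the exact
formula for $N(k,b,D)$ when $D = \mathbb{F}_q \setminus \{a_1, a_2\}$, and first note that we can always
assume $a_1 = 0$ and $a_2 = 1$ by a linear substitution.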
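Proof of Theorem 1.3. Using the simple inclusion-exclusion sieving method
by considering whether $a_2$ appears in the solution of equation (1.1), we have

\[ N(k, b, \mathbb{F}_q \setminus \{a_1, a_2\}) \]
\[ = N(k, b, \mathbb{F}_q \setminus \{a_1\}) - N(k-1, b - a_2, \mathbb{F}_q \setminus \{a_1, a_2\}) \]
\[ = N(k, b, \mathbb{F}_q \setminus \{a_1\}) - \left( N(k-1, b - a_2, \mathbb{F}_q \setminus \{a_1\}) - N(k-2, b - 2a_2, \mathbb{F}_q \setminus \{a_1, a_2\}) \right) \]
\[ = \cdots \]
\[ = \sum_{i=0}^{k-1} (-1)^i N(k-i, b - i a_2, \mathbb{F}_q \setminus \{a_1\}) + (-1)^k N(0, b - k a_2, \mathbb{F}_q \setminus \{a_1, a_2\}). \]

One checks that the above equation holds if we define $N(0, b, D)$ to be $1$ if $b = 0$ and
$0$ otherwise, for a nonempty set $D$. Noting that $a_1 = 0$ we have

\[ N(k, b, \mathbb{F}_q \setminus \{a_1, a_2\}) = \sum_{i=0}^{k} (-1)^i N(k-i, b - i a_2, \mathbb{F}_q^*). \]

From Theorem 1.2 we have the following formula

\[ N(k, b, \mathbb{F}_q^*) = \frac{1}{q}\binom{q-1}{k} - \frac{1}{q} (-1)^k v(b) R_k^1, \]

where $R_k^1 = -(-1)^{\lfloor k/p \rfloor} \binom{q/p-1}{\lfloor k/p \rfloor}$, $v(b) = -1$ if $b \ne 0$ and $v(b) = q - 1$ if $b = 0$. Thus

\[ N(k, b, \mathbb{F}_q \setminus \{a_1, a_2\}) \]
\[ = \frac{1}{q} \left( (-1)^k \sum_{k-i=0}^{k} (-1)^{k-i} \binom{q-1}{k-i} - (-1)^k \sum_{k-i=0}^{k} v(b - i a_2) R_{k-i}^1 \right) \]
\[ = \frac{1}{q} \left( (-1)^k \sum_{j=0}^{k} (-1)^{j} \binom{q-1}{j} - (-1)^k \sum_{j=0}^{k} v(b - k a_2 + j a_2) R_j^1 \right) \]
\[ = \frac{1}{q} \binom{q-2}{k} - \frac{1}{q} (-1)^k \sum_{j=0}^{k} v(b - k a_2 + j a_2) R_j^1. \]

The last equality follows from Lemma 3.1. Noting that $a_2 = 1$, and by the definition
of $v(b)$, we have

\[ N(k, b, \mathbb{F}_q \setminus \{a_1, a_2\}) \]
\[ = \frac{1}{q} \binom{q-2}{k} + \frac{1}{q} (-1)^k \sum_{j=0}^{k} R_j^1 - (-1)^k \sum_{\substack{0 \le j \le k \\ j \equiv k - b \ (\mathrm{mod}\ p)}} R_j^1 \qquad (3.4) \]
\[ = \frac{1}{q} \binom{q-2}{k} + \frac{1}{q} (-1)^k R_k^2 - (-1)^k S(k, k - b). \qquad (3.5) \]

The proof is complete. □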
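The formula $N(k,b,\mathbb{F}_q\setminus\{0,1\}) = \frac{1}{q}\binom{q-2}{k} + \frac{1}{q}(-1)^k R_k^2 - (-1)^k S(k,k-b)$ can be compared with an exhaustive count on small fields. The sketch below makes two assumptions that are not in the text: the additive group of $\mathbb{F}_q$ is modelled as vectors over $\mathbb{Z}/p$, and the element $1$ is identified with the vector $(1,0,\dots,0)$, so that $\mathbb{F}_p$ is the set of vectors supported on the first coordinate. All function names are illustrative.

```python
from itertools import combinations, product
from math import comb


def R1(i, p, q):
    """R_i^1 = -(-1)^floor(i/p) * C(q/p - 1, floor(i/p))."""
    return -((-1) ** (i // p)) * comb(q // p - 1, i // p)


def S(k, x, p, q):
    """Sum of R_i^1 over 0 <= i <= k with i = x (mod p); zero when x lies outside F_p."""
    if any(x[1:]):
        return 0
    return sum(R1(i, p, q) for i in range(k + 1) if i % p == x[0])


def theorem_1_3_holds(p, r):
    """Brute-force comparison for D = F_q minus {0, 1}, q = p**r > 2, all b and 1 <= k <= q-2."""
    q = p ** r
    F = list(product(range(p), repeat=r))
    zero, one = F[0], (1,) + (0,) * (r - 1)
    D = [x for x in F if x not in (zero, one)]
    for b in F:
        for k in range(1, q - 1):
            N = sum(1 for s in combinations(D, k)
                    if tuple(sum(col) % p for col in zip(*s)) == b)
            R2 = sum(R1(j, p, q) for j in range(k + 1))
            x = tuple((k * o - bi) % p for o, bi in zip(one, b))  # the element k - b
            if q * N != comb(q - 2, k) + (-1) ** k * R2 - q * (-1) ** k * S(k, x, p, q):
                return False
    return True
```

Only the additive structure of $\mathbb{F}_q$ enters the count, which is why a vector model suffices here.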

Combining (3.4), (3.2) and (3.3) we obtain the following simple solution number
formula, to be compared with those stated in Theorem 1.2 and Theorem 1.3.

Corollary 3.4. If $\langle k \rangle_p = p - 1$ and $b \in \mathbb{F}_p$, then we have

\[ N(k, b, \mathbb{F}_q \setminus \{0, 1\}) = \frac{1}{q} \binom{q-2}{k} + (-1)^{k + \lfloor k/p \rfloor}\, \frac{q-p}{q} \binom{q/p-2}{\lfloor k/p \rfloor}. \]

This shows that the estimate in Theorem 1.1 is nearly sharp for $q - n = 2$.



4. Proof of Theorem 1.1

Let $D = \mathbb{F}_q \setminus \{a_1, a_2, \dots, a_c\}$, where $a_1, a_2, \dots, a_c$ are distinct elements in $\mathbb{F}_q$. In
this section, based on the explicit formula of $N(k, b, D)$ for $c = 2$ given in Theorem
1.3, we first obtain a general formula for $c > 2$. Then we give the proof of Theorem
1.1. The solution number $N(k, b, \mathbb{F}_q \setminus \{a_1, a_2, \dots, a_c\})$ is closely related to the
$\mathbb{F}_p$-linear relations among the set $\{a_1, \dots, a_c\}$, as we will see in Lemma 4.2. For the
purpose of the proof of Theorem 1.1 and further investigations on the solution number
$N(k, b, D)$, we first state the following lemma.

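Lemma 4.1. Let $R_k^1 = -(-1)^{\lfloor k/p \rfloor} \binom{q/p-1}{\lfloor k/p \rfloor}$. For $c > 1$, if we define recursively
$R_k^c = \sum_{j=0}^{k} R_j^{c-1}$, then we have

\[ R_k^c = -\sum_{j=0}^{k} (-1)^{\lfloor j/p \rfloor} \binom{k + c - 2 - j}{c - 2} \binom{q/p-1}{\lfloor j/p \rfloor}. \qquad (4.1) \]

Proof. When $c = 2$, this formula is just the definition of $R_k^2$. Assume it is true for
some $c \ge 2$; then we have

\[ R_k^{c+1} = \sum_{i=0}^{k} R_i^c = -\sum_{i=0}^{k} \sum_{j=0}^{i} (-1)^{\lfloor j/p \rfloor} \binom{i + c - 2 - j}{c - 2} \binom{q/p-1}{\lfloor j/p \rfloor} \]
\[ = -\sum_{j=0}^{k} (-1)^{\lfloor j/p \rfloor} \binom{q/p-1}{\lfloor j/p \rfloor} \sum_{i=j}^{k} \binom{i + c - 2 - j}{c - 2} \]
\[ = -\sum_{j=0}^{k} (-1)^{\lfloor j/p \rfloor} \binom{k + c - 1 - j}{c - 1} \binom{q/p-1}{\lfloor j/p \rfloor}. \]

The last equality follows from the following simple binomial coefficient identity

\[ \sum_{j \le k} \binom{j + n}{n} = \binom{k + n + 1}{n + 1}. \]

□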

It is easy to check that when $k > 0$ we have

\[ N(k, b, D) = N\Big(q - c - k,\ -b - \sum_{i=1}^{c} a_i,\ D\Big), \]

where $D = \mathbb{F}_q \setminus \{a_1, a_2, \dots, a_c\}$. Thus we may always assume that $k \le (q-c)/2$. In the
following lemma, for convenience we state two different types of formulas.
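The complement symmetry $N(k,b,D) = N(q-c-k,\ -b-\sum_i a_i,\ D)$ comes from passing to the complement of a subset inside $D$ and using that the elements of $\mathbb{F}_q$ sum to zero for $q > 2$. A small sketch over an odd prime field follows; the function names are illustrative, and the restriction to $\mathbb{F}_p$ with $p$ odd is an assumption made to keep the example short.

```python
from itertools import combinations


def count_subsets(k, b, D, p):
    """N(k, b, D) over Z/p, with the convention N(0, b, D) = 1 exactly when b = 0."""
    return sum(1 for s in combinations(D, k) if sum(s) % p == b % p)


def symmetry_holds(p, removed):
    """Check N(k, b, D) == N(|D| - k, -b - sum(removed), D) for D = F_p minus removed, p odd."""
    D = [x for x in range(p) if x not in removed]
    shift = sum(removed)
    return all(
        count_subsets(k, b, D, p) == count_subsets(len(D) - k, -b - shift, D, p)
        for k in range(len(D) + 1)
        for b in range(p)
    )
```

For example, `symmetry_holds(7, (0, 1, 3))` compares the counts for $D = \{2, 4, 5, 6\} \subseteq \mathbb{F}_7$.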

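Lemma 4.2. Let $D = \mathbb{F}_q \setminus \{a_1, a_2, \dots, a_c\}$ and $c \ge 3$, where $a_1 = 0$, $a_2 = 1$, $a_3, \dots, a_c$
are distinct elements in the finite field $\mathbb{F}_q$ of characteristic $p$. Define the integer
valued function $v(b) = -1$ if $b \ne 0$ and $v(b) = q - 1$ if $b = 0$. Then for any $b \in \mathbb{F}_q$,
we have the formulas

\[ N(k,b,D) = \frac{1}{q} \binom{q-c}{k} - \frac{1}{q} (-1)^k \sum_{i_1=0}^{k} \sum_{i_2=0}^{k-i_1} \cdots \sum_{i_{c-1}=0}^{k - i_1 - \cdots - i_{c-2}} v\Big( b - \sum_{j=1}^{c-2} i_j a_{c+1-j} - \big(k - \sum_{j=1}^{c-1} i_j\big) a_2 \Big) R_{i_{c-1}}^1 \qquad (4.2) \]

\[ = \frac{1}{q} \binom{q-c}{k} + \frac{1}{q} (-1)^k R_k^c - (-1)^k \sum_{i_1=0}^{k} \sum_{i_2=0}^{k-i_1} \cdots \sum_{i_{c-2}=0}^{k - i_1 - \cdots - i_{c-3}} S\Big( k - \sum_{j=1}^{c-2} i_j,\ k - \sum_{j=1}^{c-2} i_j - b + \sum_{j=1}^{c-2} i_j a_{c+1-j} \Big), \qquad (4.3) \]

where $R_k^c$ is defined as in (3.2) and Lemma 4.1, and $S(k,b)$ is defined by (3.3). Moreover, if $a_1 = 0$,
and $b, a_2, \dots, a_c$ are linearly independent over $\mathbb{F}_p$, then we have

\[ N(k,b,D) = \frac{1}{q} \binom{q-c}{k} + \frac{1}{q} (-1)^k R_k^c. \qquad (4.4) \]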

Proof. Using the simple inclusion-exclusion sieving method we have

\[ N(k, b, \mathbb{F}_q \setminus \{a_1, a_2, \dots, a_c\}) \]
\[ = N(k, b, \mathbb{F}_q \setminus \{a_1, a_2, \dots, a_{c-1}\}) - N(k-1, b - a_c, \mathbb{F}_q \setminus \{a_1, a_2, \dots, a_{c-1}\}) + \cdots \]
\[ = \sum_{i=0}^{k} (-1)^i N(k-i, b - i a_c, \mathbb{F}_q \setminus \{a_1, a_2, \dots, a_{c-1}\}). \]

When $c = 3$, noting that $a_2 = 1$, (3.5) implies that

\[ N(k, b, \mathbb{F}_q \setminus \{a_1, a_2, a_3\}) \]
\[ = \sum_{i=0}^{k} (-1)^i \left[ \frac{1}{q} \binom{q-2}{k-i} + \frac{1}{q} (-1)^{k-i} R_{k-i}^2 - (-1)^{k-i} S\big(k-i,\ k-i-(b - i a_3)\big) \right] \]
\[ = \frac{1}{q} \binom{q-3}{k} + \frac{1}{q} (-1)^k R_k^3 - (-1)^k \sum_{i=0}^{k} S(k-i,\ k-i-b+i a_3). \]

By induction, (4.3) follows for $c \ge 3$. Similarly, (4.2) follows from Theorem 1.2.

If $b, a_2 = 1, a_3, \dots, a_c$ are linearly independent over $\mathbb{F}_p$, then first note that $b \notin \mathbb{F}_p$.
Thus, when $c = 2$, by its extended definition we have $S(k, k-b) = 0$ for any
integer $k$. When $c > 2$, since $b, a_2 = 1, a_3, \dots, a_c$ are independent, we know that
$k - \sum_{j=1}^{c-2} i_j - b + \sum_{j=1}^{c-2} i_j a_{c+1-j} \notin \mathbb{F}_p$ for any index tuple $(i_1, i_2, \dots, i_{c-2})$ in the
summation of (4.3). Thus this summation always vanishes for any $c$ and the proof
is complete. □

Now we have obtained the two formulas for the solution number $N(k, b, D)$. It
suffices to evaluate $R_k^c$ and the summation in (4.3), which is denoted by $S_k^c$.
Unfortunately, $S_k^c$ is extremely complicated when $c$ is large. The NP-hardness of
the subset sum problem indicates the hardness of precisely evaluating it. In the
following lemmas we first deduce simple bounds for $R_k^c$ and $S_k^c$.

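Lemma 4.3. Let $p < q$. Let

\[ S_k^c = \sum_{i_1=0}^{k} \sum_{i_2=0}^{k-i_1} \cdots \sum_{i_{c-2}=0}^{k - i_1 - \cdots - i_{c-3}} S\Big( k - \sum_{j=1}^{c-2} i_j,\ k - \sum_{j=1}^{c-2} i_j - b + \sum_{j=1}^{c-2} i_j a_{c+1-j} \Big). \]

Then we have

\[ \left| R_k^c - q S_k^c \right| \le (q - p) \binom{k + c - 2}{c - 2} \binom{q/p-1}{\lfloor k/p \rfloor}. \qquad (4.5) \]

Proof. By the definition of $R_k^c$ and the proof of Lemma 4.1 we have

\[ R_k^c = \sum_{i_1=0}^{k} \sum_{i_2=0}^{k-i_1} \cdots \sum_{i_{c-2}=0}^{k - i_1 - \cdots - i_{c-3}} R^2\Big( k - \sum_{j=1}^{c-2} i_j \Big), \]

where $R^2(k) = R_k^2$. From (3.2) and (3.3) it is easy to check that

\[ \left| R_k^2 - q S(k, b) \right| \le (q - p) \binom{q/p-1}{\lfloor k/p \rfloor} \]

for any $b \in \mathbb{F}_q$ when $p < q$. Therefore (4.5) follows, since the numbers of terms
appearing in the two summations of $R_k^c$ and $S_k^c$ are both $\binom{k+c-2}{c-2}$. □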



Next we turn to giving a bound for $R_k^c$. Unfortunately, even though $R_k^c$ can
be written as a simple sum involving binomial coefficients, it seems nontrivial to
evaluate it precisely. Using equation (4.1) and some combinatorial identities, we
can easily obtain an equality, numbered (4.6), which expresses $R_k^c$ as an alternating
sum of products of binomial coefficients involving $\binom{\langle k \rangle_p + c - 1}{c - 1} \binom{q/p-1}{\lfloor k/p \rfloor}$.
It has been known that the simpler sum

\[ \sum_{j} (-1)^j \binom{2n - 1 - 3j}{n - 1} \binom{n}{j}, \]

which is the coefficient of $x^n$ in $(1 + x + x^2)^n$, has no closed form. That means it
cannot be expressed as a fixed number of hypergeometric terms. For more details
we refer to ([4], p. 160). This fact indicates that $R_k^c$ also has no closed form.
Thus, in the next lemma we just give a bound for $R_k^c$, using some elementary
combinatorial arguments.

In Section 2 we have defined the unimodality of a sequence. A stronger property
than unimodality is logarithmic concavity. First recall that a function $f$ on the real
line is concave if whenever $x < y$ we have $f((x + y)/2) \ge (f(x) + f(y))/2$. Similarly,
a sequence $a_0, a_1, \dots, a_n$ of positive numbers is log concave if $\log a_i$ is a concave
function of $i$, which is to say that $(\log a_{i-1} + \log a_{i+1})/2 \le \log a_i$. Thus a sequence
is log concave if $a_{i-1} a_{i+1} \le a_i^2$. Using the properties of logarithmic concavity we
have the following lemma.



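Lemma 4.4.

\[ \left| R_k^c \right| \le p \max_{0 \le j \le k} \binom{k + c - 2 - j}{c - 2} \binom{q/p-1}{\lfloor j/p \rfloor}. \qquad (4.7) \]

Proof. It is easy to check that both the sequences $\binom{k+c-2-j}{c-2}$ and $\binom{q/p-1}{\lfloor j/p \rfloor}$ are
log concave in $j$. Thus the sequence $a_j = \binom{k+c-2-j}{c-2} \binom{q/p-1}{\lfloor j/p \rfloor}$ is also log concave in
$j$ by the definition of logarithmic concavity. Since a log concave sequence must be
unimodal, $\{a_j\}$ is unimodal in $j$. Then we have

\[ R_k^c = -\sum_{j=0}^{k} (-1)^{\lfloor j/p \rfloor} a_j \]
\[ = -\sum_{i=0}^{\lfloor k/p \rfloor} (-1)^i a_{ip} - \cdots - \sum_{i=0}^{\lfloor k/p \rfloor} (-1)^i a_{ip + \langle k \rangle_p} - \cdots - \sum_{i=0}^{\lfloor k/p \rfloor - 1} (-1)^i a_{ip + p - 1}. \]

Thus (4.7) follows from the following simple inequality for a unimodal sequence of
non-negative numbers

\[ \left| \sum_{i=0}^{k} (-1)^i a_i \right| \le \max_{0 \le i \le k} a_i, \]

and the proof is complete. □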
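Proof of Theorem 1.1. When $q > p$ we rewrite (4.3) as

\[ N(k,b,D) = \frac{1}{q} \binom{q-c}{k} + \frac{1}{q} (-1)^k \left( R_k^c - q S_k^c \right). \]

Applying (4.5) we obtain

\[ \left| N(k,b,D) - \frac{1}{q} \binom{q-c}{k} \right| \le \frac{q-p}{q} \binom{k + c - 2}{c - 2} \binom{q/p-1}{\lfloor k/p \rfloor}. \qquad (4.8) \]

If $a_1 = 0$, and $b, a_2, \dots, a_c$ are linearly independent over $\mathbb{F}_p$, then $S_k^c = 0$ for any $k$.
Thus from (4.4) and Lemma 4.4 we have the improved bound

\[ \left| N(k,b,D) - \frac{1}{q} \binom{q-c}{k} \right| \le \frac{p}{q} \max_{0 \le j \le k} \binom{k + c - 2 - j}{c - 2} \binom{q/p-1}{\lfloor j/p \rfloor}. \qquad (4.9) \]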
Thus we only need to verify the case $q = p$. When $q = p$, from Lemma 4.1 we have

\[ R_k^c = -\sum_{j=0}^{k} \binom{k + c - 2 - j}{c - 2} = -\binom{k + c - 1}{c - 1}. \]

And $S(k,b)$ equals $0$ or $-1$ by its definition given in Lemma 3.3. Thus from (4.3)
we deduce that

\[ N(k,b,D) = \frac{1}{p} \binom{p-c}{k} - \frac{(-1)^k}{p} \binom{k + c - 1}{c - 1} + (-1)^k M_k^c, \qquad (4.10) \]

with $0 \le M_k^c \le \binom{k+c-2}{c-2}$. Thus

\[ \left| N(k,b,D) - \frac{1}{q} \binom{q-c}{k} + \frac{(-1)^k}{q} \binom{k + c - 1}{c - 1} \right| \le \binom{k + c - 2}{c - 2}. \]

Note that $c = q - n$ and the proof is complete. □



Example 4.5. Choose $p = 2$, $q = 128$, $c = 4$ and $k = 5$. Then $R_k^c = -6840$. Let
$\omega$ be a primitive element in $\mathbb{F}_{128}$. Let $D = \mathbb{F}_{128} \setminus \{0, \omega, \omega^2, \omega^3\}$ and $b = 1$. Since
$1, \omega, \omega^2, \omega^3$ are linearly independent, (4.4) gives that there are $N = 1759038$ solutions
of the equation (1.1), compared with the average number $\frac{1}{q}\binom{q-c}{k} \approx 1758985$.
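The numbers in this example can be reproduced directly from the sum $R_k^c = -\sum_{j=0}^{k}(-1)^{\lfloor j/p\rfloor}\binom{k+c-2-j}{c-2}\binom{q/p-1}{\lfloor j/p\rfloor}$ and the formula $N = \frac{1}{q}\binom{q-c}{k} + \frac{1}{q}(-1)^k R_k^c$ of the linearly independent case. The following sketch uses illustrative function names.

```python
from math import comb


def R(k, c, p, q):
    """R_k^c as the alternating sum of products of binomial coefficients (c >= 2)."""
    return -sum(
        (-1) ** (j // p) * comb(k + c - 2 - j, c - 2) * comb(q // p - 1, j // p)
        for j in range(k + 1)
    )


def count_independent_case(k, c, p, q):
    """N(k, b, D) when a_1 = 0 and b, a_2, ..., a_c are linearly independent over F_p."""
    numerator = comb(q - c, k) + (-1) ** k * R(k, c, p, q)
    # the formula yields an integer, so q must divide the numerator
    assert numerator % q == 0
    return numerator // q
```

With $p = 2$, $q = 128$, $c = 4$, $k = 5$ the six terms of the sum are $21, 15, -630, -378, 5859, 1953$, which add up to $6840$.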

Remark. If one obtains better bounds for $S_k^c$, then we can improve the bound
given by (4.8). However, it is much more complicated to evaluate $S_k^c$ than $R_k^c$. Let

\[ I = \Big\{ [i_1, i_2, \dots, i_{c-2}],\ 0 \le i_t \le k - \sum_{j=1}^{t-1} i_j,\ 1 \le t \le c-2 \ :\ b - \sum_{j=1}^{c-2} i_j a_{c+1-j} \in \mathbb{F}_p \Big\}. \]

Simple counting shows that $0 \le |I| \le \binom{k+c-2}{c-2}$. In the proof of (4.8) we use the upper
bound $|I| \le \binom{k+c-2}{c-2}$, and in the proof of (4.4) it is the special case $|I| = 0$. We can
improve the above bound if we know more information about the cardinality of $I$,
which is determined by the set $\{b, a_2, \dots, a_c\}$. For example, if we know more about
the rank of the set $\{b, a_2, \dots, a_c\}$, then we can improve the bound given by (4.8).
The details are omitted.

5. Applications to Reed-Solomon Codes

Let $D = \{x_1, \dots, x_n\} \subseteq \mathbb{F}_q$ be a subset of cardinality $|D| = n > 0$. For
$1 \le k \le n$, the Reed-Solomon code $D_{n,k}$ has the codewords of the form

\[ (f(x_1), \dots, f(x_n)) \in \mathbb{F}_q^n, \]

where $f$ runs over all polynomials in $\mathbb{F}_q[x]$ of degree at most $k - 1$. The minimum
distance of the Reed-Solomon code is $n - k + 1$ because a non-zero polynomial of
degree at most $k - 1$ has at most $k - 1$ zeroes. For $u = (u_1, u_2, \dots, u_n) \in \mathbb{F}_q^n$, we
can associate a unique polynomial $u(x) \in \mathbb{F}_q[x]$ of degree at most $n - 1$ such that

\[ u(x_i) = u_i \]

for all $1 \le i \le n$. The polynomial $u(x)$ can be computed quickly by solving the
above linear system. Explicitly, the polynomial $u(x)$ is given by the Lagrange
interpolation formula

\[ u(x) = \sum_{i=1}^{n} u_i \frac{\prod_{j \ne i} (x - x_j)}{\prod_{j \ne i} (x_i - x_j)}. \]
Define $d(u)$ to be the degree of the associated polynomial $u(x)$ of $u$. It is easy to
see that $u$ is a codeword if and only if $d(u) \le k - 1$.
For a given $u \in \mathbb{F}_q^n$, define

\[ d(u, D_{n,k}) := \min_{v \in D_{n,k}} d(u, v). \]

The maximum likelihood decoding of $u$ is to find a codeword $v \in D_{n,k}$ such that
$d(u, v) = d(u, D_{n,k})$. Thus, computing $d(u, D_{n,k})$ is essentially the decision version
of the maximum likelihood decoding problem, which is NP-complete for a general
subset $D \subseteq \mathbb{F}_q$. For the standard Reed-Solomon code with $D = \mathbb{F}_q^*$ or $\mathbb{F}_q$, it is not
known whether maximum likelihood decoding is NP-complete. This
is an important open problem. It has been shown by Cheng-Wan [2, 3] to be at
least as hard as the discrete logarithm problem.

When $d(u) \le k - 1$, $u$ is a codeword and thus $d(u, D_{n,k}) = 0$. We shall
assume that $k \le d(u) \le n - 1$. The following simple result gives an elementary
bound for $d(u, D_{n,k})$.

Theorem 5.1. Let $u \in \mathbb{F}_q^n$ be a word such that $k \le d(u) \le n - 1$. Then,

\[ n - k \ge d(u, D_{n,k}) \ge n - d(u). \]

Proof. Let $v = (v(x_1), \dots, v(x_n))$ be a codeword of $D_{n,k}$, where $v(x)$ is a
polynomial in $\mathbb{F}_q[x]$ of degree at most $k - 1$. Then,

\[ d(u, v) = n - N_D(u(x) - v(x)), \]

where $N_D(u(x) - v(x))$ denotes the number of zeros of the polynomial $u(x) - v(x)$
in $D$. Thus,

\[ d(u, D_{n,k}) = n - \max_{v \in D_{n,k}} N_D(u(x) - v(x)). \]

Now $u(x) - v(x)$ is a polynomial of degree equal to $d(u)$. We deduce that

\[ N_D(u(x) - v(x)) \le d(u). \]

It follows that

\[ d(u, D_{n,k}) \ge n - d(u). \]

The lower bound is proved. To prove the upper bound, we choose a subset $\{x_1, \dots, x_k\}$
in $D$ and let $g(x) = (x - x_1) \cdots (x - x_k)$. Write

\[ u(x) = g(x) h(x) + v(x), \]

where $v(x) \in \mathbb{F}_q[x]$ has degree at most $k - 1$. Then, clearly, $N_D(u(x) - v(x)) \ge k$.
Thus

\[ d(u, D_{n,k}) \le n - k. \]

The theorem is proved. □
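For a tiny prime field the bounds $n - k \ge d(u, D_{n,k}) \ge n - d(u)$ can be observed by exhaustive search. The sketch below is an illustration under stated assumptions: it works over $\mathbb{F}_p$ only, enumerates words through their interpolating polynomials (each coefficient vector of length $n$ corresponds to exactly one word), and uses made-up function names.

```python
from itertools import product


def evaluate(coeffs, x, p):
    """Evaluate the polynomial sum(coeffs[i] * x**i) modulo p by Horner's rule."""
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * x + a) % p
    return acc


def degree(coeffs):
    """Degree of the polynomial, with -1 for the zero polynomial."""
    return max((i for i, a in enumerate(coeffs) if a), default=-1)


def distance_table(p, D, k):
    """Map each degree d(u) to the set of distances d(u, code) seen among words u."""
    n = len(D)
    codewords = [tuple(evaluate(f, x, p) for x in D) for f in product(range(p), repeat=k)]
    table = {}
    for coeffs in product(range(p), repeat=n):  # polynomials of degree <= n - 1
        u = tuple(evaluate(coeffs, x, p) for x in D)
        dist = min(sum(a != b for a, b in zip(u, v)) for v in codewords)
        table.setdefault(degree(coeffs), set()).add(dist)
    return table
```

For $p = 5$, $D = \{1, 2, 3, 4\}$ and $k = 2$, every word with $d(u) = 2$ is at distance exactly $n - k = 2$ from the code, and every word with $d(u) = 3$ is at distance $1$ or $2$.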

We call $u$ a deep hole if $d(u, D_{n,k}) = n - k$, that is, if equality holds in the
upper bound. When $d(u) = k$, the upper bound agrees with the lower bound
and thus $u$ must be a deep hole. This gives $(q - 1) q^k$ deep holes. For a general
Reed-Solomon code $D_{n,k}$, it is already difficult to determine if a given word $u$ is a
deep hole. In the special case that $d(u) = k + 1$, the deep hole problem is equivalent
to the subset sum problem over $\mathbb{F}_q$, which is NP-complete if $p > 2$.

For the standard Reed-Solomon code, that is, $D = \mathbb{F}_q^*$ and thus $n = q - 1$, there
is the following interesting conjecture of Cheng-Murray [1].
Conjecture. Let $q = p$. For the standard Reed-Solomon code with $D = \mathbb{F}_p^*$, the
set $\{u \in \mathbb{F}_p^n \mid d(u) = k\}$ gives the set of all deep holes.

Using the Weil bound, Cheng and Murray proved that their conjecture is true if
$p$ is sufficiently large compared to $k$.

The deep hole problem is to determine when the upper bound in the above
theorem agrees with $d(u, D_{n,k})$. We now examine when the lower bound $n - d(u)$
agrees with $d(u, D_{n,k})$. It turns out that the lower bound agrees with $d(u, D_{n,k})$
much more often. We call $u$ ordinary if $d(u, D_{n,k}) = n - d(u)$. A basic problem is
then to determine, for a given word $u$, when $u$ is ordinary.

Without loss of generality, we can assume that $u(x)$ is monic and $d(u) = k + m$,
$0 < m < n - k$. Let

\[ u(x) = x^{k+m} - b_1 x^{k+m-1} + \cdots + (-1)^m b_m x^k + \cdots + (-1)^{k+m} b_{k+m} \]

be a monic polynomial in $\mathbb{F}_q[x]$ of degree $k + m$. By definition, $d(u, D_{n,k}) =
n - (k + m)$ if and only if there is a polynomial $v(x) \in \mathbb{F}_q[x]$ of degree at most $k - 1$
such that

\[ u(x) - v(x) = (x - x_1) \cdots (x - x_{k+m}), \]

with $x_i \in D$ being distinct. This is true if and only if the system

\[ \sum_{i=1}^{k+m} X_i = b_1, \]
\[ \sum_{1 \le i_1 < i_2 \le k+m} X_{i_1} X_{i_2} = b_2, \]
\[ \cdots \]
\[ \sum_{1 \le i_1 < i_2 < \cdots < i_m \le k+m} X_{i_1} \cdots X_{i_m} = b_m \]

has distinct solutions $x_i \in D$. This explains our motivational problem in the
introduction section.

When $d(u) = k$, $u$ is always a deep hole. The next non-trivial case is when
$d(u) = k + 1$. Using the bound in Theorem 1.1 we obtain some positive results
related to the deep hole problem in the case $d(u) = k + 1$ (i.e., the case $m = 1$) if
$q - n$ is small. When $q - n \le 1$, by Corollary 2.7 we first have the following simple
consequence.

Corollary 5.2. Let $q \ge n \ge q - 1$ and $q > 5$. Let $d(u) = k + 1$ with $2 < k < q - 3$.
Then $u$ cannot be a deep hole.

Proof. By the above discussion, $u$ is not a deep hole if and only if the equation

\[ x_1 + x_2 + \cdots + x_{k+1} = b \]

always has distinct solutions in $D$ for any $b \in \mathbb{F}_q$. Thus the result follows from
Corollary 2.7. □

Remark. Similarly, using Theorem 1.1, a simple asymptotic argument implies
that when $q - n$ is a constant and $d(u) = k + 1$ with $2 < k < q - 3$, then $u$ cannot be
a deep hole for sufficiently large $q$. Furthermore, for given $q, n$, asymptotic analysis
can give sufficient conditions on $k$ to ensure that a word $u$ of degree $k + 1$ is not a deep
hole.
In the present paper, we studied the case $m = 1$ and explored some of the
combinatorial aspects of the problem. In a future article, we plan to study the
case $m > 1$ by combining the ideas of the present paper with algebraic-geometric
techniques such as the Weil bound.